- Linux WWW HOWTO
- by Wayne Leister, n3mtr@qis.net
- v0.82, 19 November 1997
-
- This document contains information about setting up WWW services under
- Linux (both server and client). It is not meant to be a detailed
- manual, but an overview and a good pointer to further information.
-
- 1. Introduction
-
- Many people are trying Linux because they are looking for a really
- good Internet capable operating system. Also, there are institutes,
- universities, non-profits, and small businesses which want to set up
- Internet sites on a small budget. This is where the WWW-HOWTO comes
- in. This document explains how to set up clients and servers for the
- largest part of the Internet - The World Wide Web.
-
- All prices in this document are stated in US dollars. This document
- assumes you are running Linux on an Intel platform. Instructions and
- product availability may vary from platform to platform. There are
- many links for downloading software in this document. Whenever
- possible use a mirror site for faster downloading and to keep the load
- down on the main server.
-
- The US government forbids US companies from exporting encryption
- stronger than 40 bits. Therefore US companies will usually have two
- versions of their software. The import version will usually support
- 128 bit encryption, and the export version only 40 bit. This applies
- to web browsers and servers supporting secure transactions. Secure
- transactions are carried over the Secure Sockets Layer (SSL)
- protocol. We will refer to it as SSL for the rest of this document.
-
- 1.1. Copyright
-
- This document is Copyright (c) 1997 by Wayne Leister. The original
- author of this document was Peter Dreuw (all versions prior to 0.8).
-
- This HOWTO is free documentation; you can redistribute it
- and/or modify it under the terms of the GNU General Public
- License as published by the Free Software Foundation; either
- version 2 of the License, or (at your option) any later version.
-
- This document is distributed in the hope that it will be
- useful, but without any warranty; without even the implied
- warranty of merchantability or fitness for a particular purpose.
- See the GNU General Public License for more details.
-
- You can obtain a copy of the GNU General Public License by
- writing to the Free Software Foundation, Inc., 675 Mass Ave,
- Cambridge, MA 02139, USA.
-
- Trademarks are owned by their respective owners.
-
- 1.2. Feedback
-
- Any feedback is welcome. I do not claim to be an expert. Some of
- this information was taken from badly written web sites; there are
- bound to be errors and omissions. But make sure you have the latest
- version before you send corrections; the error may already be fixed
- (see the next section for where to get the latest version).
- Send feedback to n3mtr@qis.net.
-
- 1.3. New versions of this Document
-
- New versions of this document can be retrieved in text format from
- Sunsite at <http://sunsite.unc.edu/pub/Linux/docs/HOWTO/WWW-HOWTO> and
- almost any Linux mirror site. You can view the latest HTML version on
- the web at <http://sunsite.unc.edu/LDP/HOWTO/WWW-HOWTO.html>. There
- are also HTML versions available on Sunsite in a tar archive.
-
- 2. Setting up WWW client software
-
- The following chapter is dedicated to setting up web browsers.
- Please feel free to contact me if your favorite web browser is not
- mentioned here. In this version of the document only a few of the
- browsers have their own section, but I tried to include all of them
- (all I could find) in the overview section. In the future those
- browsers that deserve their own section will have it.
-
- The overview section is designed to help you decide which browser to
- use, and give you basic information on each browser. The detail
- section is designed to help you install, configure, and maintain the
- browser.
-
- Personally, I prefer Netscape; it is the only browser that keeps up
- with the latest things in HTML, for example frames, Java,
- JavaScript, style sheets, secure transactions, and layers. Nothing is
- worse than trying to visit a web site and finding out that you can't
- view it because your browser doesn't support some new feature.
-
- However, I use Lynx when I don't feel like firing up the
- X-windows/Netscape monster.
-
- 2.1. Overview
-
- ``Navigator/Communicator''
- Netscape Navigator is the only browser mentioned here that is
- capable of advanced HTML features. Some of these features are
- frames, Java, JavaScript, automatic update, and layers. It also
- has news and mail capability. But it is a resource hog; it
- takes up lots of CPU time and memory. It also sets up a
- separate cache for each user, wasting disk space. Netscape is a
- commercial product. Companies have a 30 day trial period, but
- there is no limit for individuals. I would encourage you to
- register anyway to support Netscape in their efforts against
- Microsoft (and what is a measly US$40?). My guess is if
- Microsoft wins, we will be forced to use MS Internet Explorer on
- a Windows platform :(
-
- ``Lynx''
- Lynx is one of the smallest web browsers. It is the king of
- text based browsers. It's free and the source code is available
- under the GNU General Public License. It's text based, but it has
- many special features.
-
- Kfm
- Kfm is part of the K Desktop Environment (KDE). KDE is a system
- that runs on top of X-windows. It gives you many features like
- drag and drop, sounds, a trashcan and a unified look and feel.
- Kfm is the K File Manager, but it is also a web browser. Don't
- be fooled by the name; for a young product it is very usable as
- a web browser. It already supports frames, tables, ftp
- downloads, looking into tar files, and more. The current
- version of Kfm is 1.39, and it's free. Kfm can be used without
- KDE, but you still need the libraries that come with KDE. For
- more information about KDE and Kfm visit the KDE website at
- <http://www.kde.org>.
-
- ``Emacs''
- Emacs is the one program that does everything. It is a word
- processor, news reader, mail reader, and web browser. It has a
- steep learning curve at first, because you have to learn what
- all the keys do. The X-windows version is easier to use,
- because most of the functions are on menus. Another drawback is
- that it's mostly text based (it can display graphics if you are
- running it under X-windows). It is also free, and the source
- code is available under the GNU General Public License.
-
- NCSA Mosaic
- Mosaic is an X-windows browser developed by the National Center
- for Supercomputing Applications (NCSA) at the University of
- Illinois. NCSA spent four years on the project and has now
- moved on to other things. The latest version is 2.6 which was
- released on July 7, 1995. Source code is available for non-
- commercial use. Spyglass Inc. <http://www.spyglass.com> has the
- commercial rights to Mosaic. It's a solid X-windows browser, but
- it lacks the new HTML features. For more info visit the NCSA
- Mosaic home page at
- <http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/>. The software
- can be downloaded from
- <ftp://ftp.ncsa.uiuc.edu/Mosaic/Unix/binaries/2.6/Mosaic-linux-2.6.Z>.
-
- Arena
- Arena was an X-windows concept browser for the W3C (World Wide
- Web Consortium) when they were testing HTML 3.0. Hence it
- supports all the HTML 3.0 standards such as style sheets and
- tables. Development was taken over by Yggdrasil Computing, with
- the idea to turn it into a full-fledged free X-windows browser.
- However, development stopped in February 1997 with version 0.3.11.
- Only part of the HTML 3.2 standard has been implemented. The
- source code is released under the GNU General Public License. For
- more information see the web site at
- <http://www.yggdrasil.com/Products/Arena/>. It can be
- downloaded from <ftp://ftp.yggdrasil.com/pub/dist/web/arena/>.
-
- Amaya
- Amaya is the X-windows concept browser for the W3C for HTML 3.2.
- Therefore it supports all the HTML 3.2 standards. It also
- supports some of the features of HTML 4.0. It supports tables,
- forms, client side image maps, publishing with HTTP PUT, and GIF,
- JPEG, and PNG graphics. It is both a browser and authoring tool.
- The latest public release is 1.0 beta. Version 1.1 beta is in
- internal testing and is due out soon. For more information
- visit the Amaya web site at <http://www.w3.org/Amaya/>. It can
- be downloaded from
- <ftp://ftp.w3.org/pub/Amaya-LINUX-ELF-1.0b.tar.gz>.
-
- Red Baron
- Red Baron is an X-windows browser made by Red Hat Software. It
- is bundled with The Official Red Hat Linux distribution. I
- could not find much information on it, but I know it supports
- frames, forms and SSL. If you use Red Baron, please help me
- fill in this section. For more information visit the Red Hat
- website at <http://www.redhat.com>
-
- Chimera
- Chimera is a basic X-windows browser. It supports some of the
- features of HTML 3.2. The latest release is 2.0 alpha 6
- released August 27, 1997. For more information visit the
- Chimera website at <http://www.unlv.edu/chimera/>. Chimera can
- be downloaded from
- <ftp://ftp.cs.unlv.edu/pub/chimera-alpha/chimera-2.0a6.tar.gz>.
-
- Qweb
- Qweb is yet another basic X-windows browser. It supports
- tables, forms, and server side image maps. The latest version
- is 1.3. For more information visit the Qweb website at
- <http://sunsite.auc.dk/qweb/>. The source is available from
- <http://sunsite.auc.dk/qweb/qweb-1.3.tar.gz>. The binaries are
- available in a Red Hat RPM from
- <http://sunsite.auc.dk/qweb/qweb-1.3-1.i386.rpm>.
-
- Grail
- Grail is an X-windows browser developed by the Corporation for
- National Research Initiatives (CNRI). Grail is written entirely
- in Python, an interpreted object-oriented language. The latest
- version is 0.3 released on May 7, 1997. It supports forms,
- bookmarks, history, frames, tables, and many HTML 3.2 features.
-
- Internet Explorer
- There are rumors that Microsoft is going to port Internet
- Explorer to various Unix platforms, maybe Linux. If it's true,
- they are taking their time doing it. If you know something more
- reliable, please drop me an e-mail.
-
- In my humble opinion most of the above software is unusable for
- serious web browsing. I'm not trying to discredit the authors; I know
- they worked very hard on these projects. Just think, if all of these
- people had worked together on one project, maybe we would have a free
- browser that would rival Netscape and Internet Explorer.
-
- In my opinion out of all of the browsers, Netscape and Lynx are the
- best. The runners up would be Kfm, Emacs-W3 and Mosaic.
-
- 3. Lynx
-
- Lynx is one of the smaller (around 600 K executable) and faster web
- browsers available. It does not eat up much bandwidth nor system
- resources as it only deals with text displays. It can display on any
- console, terminal or xterm. You will not need an X Windows system or
- additional system memory to run this little browser.
-
- 3.1. Where to get
-
- Both the Red Hat and Slackware distributions have Lynx in them.
- Therefore I will not bore you with the details of compiling and
- installing Lynx.
-
- The latest version is 2.7.1 and can be retrieved from
- <http://www.slcc.edu/lynx/fote/> or from almost any friendly Linux FTP
- server, such as ftp://sunsite.unc.edu under
- /pub/Linux/apps/www/browsers/, or one of its mirror sites.
-
- For more information on Lynx try these locations:
-
- Lynx Links
- <http://www.crl.com/~subir/lynx.html>
-
- Lynx Pages
- <http://lynx.browser.org>
-
- Lynx Help Pages
- <http://www.crl.com/~subir/lynx/lynx_help/lynx_help_main.html>
- (the same pages you get from lynx --help and typing ? in lynx)
-
- Note: The Lynx help pages have recently moved. If you have an older
- version of Lynx, you will need to change your lynx.cfg (in /usr/lib)
- to point to the new address (above).
-
- I think the feature that most sets Lynx apart from all other web
- browsers is its capability for batch mode retrieval. One can write a
- shell script which retrieves a document, file or anything like that
- via HTTP, FTP, gopher, WAIS, NNTP or file:// URLs and saves it to
- disk. Furthermore, one can fill in data into HTML forms in batch mode
- by simply redirecting the standard input and using the -post_data
- option.
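
A minimal sketch of such a batch script, assuming Lynx is installed and
on the PATH. The function names, the LYNX variable, and the example URL
and form fields are made up for illustration; -dump, -source, and
-post_data are Lynx command line options.

```shell
#!/bin/sh
# Name of the Lynx binary. Keeping it in a variable is a choice made
# for this sketch, so another path can be substituted.
LYNX="${LYNX:-lynx}"

# Save the formatted text of a page, as Lynx would display it.
fetch_text() {
    "$LYNX" -dump "$1" > "$2"
}

# Save the unformatted source of a document instead.
fetch_source() {
    "$LYNX" -source "$1" > "$2"
}

# Submit form data read from standard input. Lynx expects the data
# to be terminated by a line starting with "---".
post_form() {
    { cat; echo "---"; } | "$LYNX" -post_data "$1"
}

# Example calls (hypothetical URL and fields):
#   fetch_text http://www.example.com/ page.txt
#   echo "name=linux&os=2.0" | post_form http://www.example.com/cgi-bin/form
```

Check the Lynx help pages for the exact behaviour of these options in
your version before relying on them in a script.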
-
- For more special features of Lynx just look at the help files and the
- man pages. If you use a special feature of Lynx that you would like
- to see added to this document, let me know.
-
- 4. Emacs-W3
-
- There are several different flavors of Emacs. The two most popular
- are GNU Emacs and XEmacs. GNU Emacs is put out by the Free Software
- Foundation, and is the original Emacs. It is mainly geared toward
- text based terminals, but it does run in X-Windows. XEmacs (formerly
- Lucid Emacs) is a version that only runs on X-Windows. It has many
- special features that are X-Windows related (better menus etc).
-
- 4.1. Where to get
-
- Both the Red Hat and Slackware distributions include GNU Emacs.
-
- The most recent GNU emacs is 19.34. It doesn't seem to have a web
- site. The FTP site is at <ftp://ftp.gnu.ai.mit.edu/pub/gnu/>.
-
- The latest version of XEmacs is 20.2. The XEmacs FTP site is at
- <ftp://ftp.xemacs.org/pub/xemacs>. For more information about XEmacs
- see its web page at <http://www.xemacs.org>.
-
- Both are available from the Linux archives at ftp://sunsite.unc.edu
- under /pub/Linux/apps/editors/emacs/
-
- If you have GNU Emacs or XEmacs installed, you probably have the W3
- browser as well.
-
- The Emacs W3 mode is a nearly fully featured web browser system
- written in Emacs Lisp. It mostly deals with text, but it can display
- graphics too, at least if you run Emacs under the X Window system.
-
- To get XEmacs into W3 mode, go to the apps menu and select browse the
- web.
-
- I don't use Emacs, so if someone will explain how to get it into the
- W3 mode I'll add it to this document. Most of this information was
- from the original author. If any information is incorrect, please let
- me know. Also let me know if you think anything else should be added
- about Emacs.
-
- 5. Netscape Navigator/Communicator
-
- 5.1. Different versions and options.
-
- Netscape Navigator is the King of WWW browsers. Netscape Navigator
- can do almost everything. But on the other hand, it is one of the most
- memory hungry and resource eating programs I've ever seen.
-
- There are 3 different versions of the program:
-
- Netscape Navigator includes the web browser, netcaster (push client)
- and a basic mail program.
-
- Netscape Communicator includes the web browser, a web editor, an
- advanced mail program, a news reader, netcaster (push client), and a
- group conference utility.
-
- Netscape Communicator Pro includes everything Communicator has plus a
- group calendar, IBM terminal emulation, and remote administration
- features (administrators can update thousands of copies of Netscape
- from their desk).
-
- In addition to the three versions there are two other options you must
- pick.
-
- The first is full install or base install. The full install includes
- everything. The base install includes enough to get you started. You
- can download the additional components as you need them (such as
- multimedia support and netcaster). These components can be installed
- by the Netscape smart update utility (after installing go to
- help->software updates). At this time the full install is not
- available for Linux.
-
- The second option is import or export. If you are from the US or
- Canada you have the option of selecting the import version. This
- gives you the stronger 128 bit encryption for secure transactions
- (SSL). The export version only has 40 bit encryption, and is the only
- version allowed outside the US and Canada.
-
- The latest version of the Netscape Navigator/Communicator/Communicator
- Pro is 4.03. There are two different versions for Linux. One is for
- the old 1.2 series kernels and one for the new 2.0 kernels. If you
- don't have a 2.0 kernel I suggest you upgrade; there are many
- improvements in the new kernel.
-
- Beta versions are also available. If you try a beta version, be
- aware that they usually expire in a month or so!
-
- 5.2. Where to get
-
- The best way to get Netscape software is to go through their web site
- at <http://www.netscape.com/download/>. They have menus to guide you
- through the selection. When it asks for the Linux version, it is
- referring to the kernel (most people should be using 2.0 by now). If
- you're not sure which kernel version you have, run 'cat /proc/version'.
- Going through the web site is the only way to get the import versions.
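
If you would rather check the kernel series from a script, the
following is one possible sketch. The function name kernel_series is
made up for this example; uname -r prints the running kernel release
(for example 2.0.30).

```shell
#!/bin/sh
# Reduce a kernel release string such as 2.0.30 to its series (2.0).
kernel_series() {
    echo "$1" | cut -d. -f1,2
}

series=$(kernel_series "$(uname -r)")
echo "Running kernel series: $series"
```

A result of 1.2 or 2.0 tells you which of the two Linux downloads to
pick.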
-
- If you want an export version you can download them directly from the
- Netscape FTP servers. The FTP servers are also more up to date. For
- example when I first wrote this the web interface did not have the
- non-beta 4.03 for Linux yet, but it was on the FTP site. Here are the
- links to the export Linux 2.0 versions:
-
- Netscape Navigator 4.03 is at
- <ftp://ftp.netscape.com/pub/communicator/4.03/shipping/english/unix/linux20/navigator_standalone/navigator-v403-export.x86-unknown-linux2.0.tar.gz>
-
- Netscape Communicator 4.03 for Linux 2.0 (kernel) is at
- <ftp://ftp.netscape.com/pub/communicator/4.03/shipping/english/unix/linux20/base_install/communicator-v403-export.x86-unknown-linux2.0.tar.gz>
-
- Communicator Pro 4.03 for Linux was not available at the time I wrote
- this.
-
- These URLs will change as new versions come out. If these links
- break you can find them by fishing around at the FTP site
- <ftp://ftp.netscape.com/pub/communicator/>.
-
- These servers are heavily loaded at times. It's best to wait for off
- peak hours or select a mirror site. Be prepared to wait; these
- archives are large. Navigator is almost 8 MB, and the Communicator
- base install is 10 MB.
-
- 5.3. Installing
-
- This section explains how to install version 4 of Netscape Navigator,
- Communicator, and Communicator Pro.
-
- First unpack the archive to a temporary directory. Then run the
- ns-install script (type ./ns-install). Then make a symbolic link from
- the /usr/local/netscape/netscape binary to /usr/local/bin/netscape
- (type ln -s /usr/local/netscape/netscape /usr/local/bin/netscape).
- Finally set the system wide environment variable $MOZILLA_HOME to
- /usr/local/netscape so Netscape can find its files. If you are using
- bash for your shell, edit your /etc/profile and add the lines:
-
- MOZILLA_HOME="/usr/local/netscape"
- export MOZILLA_HOME
-
- After you have it installed the software can automatically update
- itself with smart update. Just run Netscape as root and go to
- help->software updates. If you only got the base install, you can
- also install the Netscape components from there.
-
- Note: This will not remove any old versions of Netscape; you must
- manually remove them by deleting the Netscape binary and Java class
- file (for version 3).
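
The installation steps in this section can be strung together roughly
as follows. This is only a sketch: the run helper, the DRY_RUN switch,
and the WORKDIR scratch directory are inventions of this example, and
it assumes ns-install sits at the top level of the unpacked archive.
By default it only prints the commands.

```shell
#!/bin/sh
# DRY_RUN defaults to 1, so the commands are only printed. Set
# DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
# Archive name taken from the download section; WORKDIR is a
# hypothetical scratch directory.
ARCHIVE="${ARCHIVE:-communicator-v403-export.x86-unknown-linux2.0.tar.gz}"
WORKDIR="${WORKDIR:-/tmp/netscape-install}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_netscape() {
    run mkdir -p "$WORKDIR" &&
    run tar -xzf "$ARCHIVE" -C "$WORKDIR" &&
    run cd "$WORKDIR" &&
    run ./ns-install &&
    run ln -s /usr/local/netscape/netscape /usr/local/bin/netscape
}

install_netscape
```

The MOZILLA_HOME lines for /etc/profile are left out on purpose; add
them by hand as described above.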
-
- 6. Setting up WWW server systems
-
- This section contains information on different http server software
- packages and additional server side tools like script languages for
- CGI programs etc. There are several dozen web servers; I only covered
- those that are fully functional. As some of these are commercial
- products, I have no way of trying them. Most of the information in
- the overview section was pieced together from various web sites. If
- there is any incorrect or missing information please let me know.
-
- For a technical description on the http mechanism, take a look at the
- RFC documents mentioned in the chapter "For further reading" of this
- HOWTO.
-
- I prefer to use the Apache server. It has almost all the features you
- would ever need and it's free! I will admit that this section is
- heavily biased toward Apache. I decided to concentrate my efforts on
- the Apache section rather than spread it out over all the web servers.
- I may cover other web servers in the future.
-
- 6.1. Overview
-
- CERN httpd
- This was the first web server. It was developed by the European
- Laboratory for Particle Physics (CERN). CERN httpd is no longer
- supported. The CERN httpd server is reported to have some ugly
- bugs, and to be quite slow and resource hungry. The latest version
- is 3.0. For more information visit the CERN httpd home page at
- <http://www.w3.org/Daemon/Status.html>. It is available for
- download at
- <ftp://sunsite.unc.edu/pub/Linux/apps/www/servers/httpd-3.0.term.tpz>
- (no it is not a typo, the extension is actually .tpz on the
- site; probably should be .tgz)
-
- NCSA HTTPd
- The NCSA HTTPd server is the father to Apache (The development
- split into two different servers). Therefore the setup files
- are very similar. NCSA HTTPd is free and the source code is
- available. This server is not covered in this document, although
- reading the Apache section may give you some help. The NCSA
- server was once popular, but most people are replacing it with
- Apache. Apache is a drop in replacement for the NCSA
- server (same configuration files), and it fixes several
- shortcomings of the NCSA server. NCSA HTTPd accounts for 4.9%
- (and falling) of all web servers. (source September 1997
- Netcraft survey <http://www.netcraft.com/survey/>). The latest
- version is 1.5.2a. For more information see the NCSA website at
- <http://hoohoo.ncsa.uiuc.edu>.
-
- ``Apache''
- Apache is the king of all web servers. Apache and its source
- code are free. Apache is modular, therefore it is easy to add
- features. Apache is very flexible and has many, many features.
- Apache makes up 44% of all web domains (50% if you count all
- the derivatives). There are over 695,000 Apache servers in
- operation (source November 1997 Netcraft survey
- <http://www.netcraft.com/survey/>).
-
- The official Apache is missing SSL, but there are two
- derivatives that fill the gap. Stronghold is a commercial
- product that is based on Apache. It retails for $995; an
- economy version is available for $495 (based on an old version
- of Apache). Stronghold is the number two secure server behind
- Netscape (source C2 net <http://www.c2.net/products/stronghold>
- and Netcraft survey <http://www.netcraft.com/survey/>). For
- more information visit the Stronghold website at
- <http://www.c2.net/products/stronghold/>. It was developed
- outside the US, so it is available with 128 bit SSL everywhere.
-
- Apache-SSL is a free implementation of SSL, but it is not for
- commercial use in the US (RSA has US patents on SSL technology).
- It can be used for non-commercial use in the US if you link with
- the free RSAREF library. For more information see the website
- at <http://www.algroup.co.uk/Apache-SSL/>.
-
- Netscape Fast Track Server
- Fast Track was developed by Netscape, but the Linux version is
- put out by Caldera. The Caldera site lists it as Fast Track for
- OpenLinux. I'm not sure if it only runs on Caldera OpenLinux or
- if any Linux distribution will do (E-mail me if you have the
- answer). Netscape servers account for 11.5% (and falling) of
- all web servers (source September 1997
- <http://www.netcraft.com/survey/>). The server sells for $295.
- It is also included with the Caldera OpenLinux Standard
- distribution which sells for $399 ($199.50 educational). The
- web pages tell of a nice administration interface and a quick 10
- minute setup. The server has support for 40-bit SSL. To get
- the full 128-bit SSL you need Netscape Enterprise Server.
- Unfortunately that is not available for Linux :( The latest
- version available for Linux is 2.0 (Version 3 is in beta, but
- it's not available for Linux yet). To buy a copy go to the
- Caldera web site at
- <http://www.caldera.com/products/netscape/netscape.html>. For
- more information go to the Fast Track page at
- <http://www.netscape.com/comprod/server_central/product/fast_track/>.
-
- WN
- WN has many features that make it attractive. First, it is
- smaller than the CERN, NCSA HTTPd, and Apache servers. It also
- has many built-in features that would otherwise require CGIs,
- for example site searches and enhanced server side includes. It
- can also decompress/compress files on the fly with its filter
- feature. It also has the ability to retrieve only part of a
- file with its ranges feature. It is released under the GNU
- General Public License. The current version is 1.18.3. For more
- information see the WN website at <http://hopf.math.nwu.edu/>.
-
- AOLserver
- AOLserver is made by America Online. I'll admit that I was
- surprised by the features of a web server coming from AOL. In
- addition to the standard features it supports database
- connectivity. Pages can query a database by Structured Query
- Language (SQL) commands. The database is accessed through Open
- Database Connectivity (ODBC). It also has a built-in search
- engine and Tcl scripting. If that is not enough you can add
- your own modules through the C Application Programming Interface
- (API). I almost forgot to mention support for 40 bit SSL. And
- you get all this for free! For more information visit the
- AOLserver site at <http://www.aolserver.com/server/>.
-
- Zeus Server
- Zeus Server was developed by Zeus Technology. They claim that
- they are the fastest web server (using WebSpec96 benchmark).
- The server can be configured and controlled from a web browser!
- It can limit processor and memory resources for CGI's, and it
- executes them in a secure environment (whatever that means). It
- also supports unlimited virtual servers. It sells for $999 for
- the standard version. If you want the secure server (SSL) the
- price jumps to $1699. They are based outside the US so 128 bit
- SSL is available everywhere. For more information visit the
- Zeus Technology website at <http://www.zeus.co.uk>. The US
- website is at <http://www.zeus.com>. I'll warn you they are
- cocky about the fastest web server thing. But they don't even
- show up under top web servers in the Netcraft Surveys.
-
- CL-HTTP
- CL-HTTP stands for Common Lisp Hypermedia Server. If you are a
- Lisp programmer this server is for you. You can write your CGI
- scripts in Lisp. It has a web based setup function. It also
- supports all the standard server features. CL-HTTP is free and
- the source code is available. For more information visit the
- CL-HTTP website at
- <http://www.ai.mit.edu/projects/iiip/doc/cl-http/home-page.html>
- (could they make that url any longer?).
-
- If you have a commercial purpose (company web site, or ISP), I would
- strongly recommend that you use Apache. If you are looking for easy
- setup at the expense of advanced features then the Zeus Server wins
- hands down. I've also heard that the Netscape Server is easy to
- set up. If you have an internal use you can be a bit more flexible.
- But unless one of them has a feature that you just have to use, I
- would still recommend using one of the three above.
-
- This is only a partial listing of all the servers available. For a
- more complete list visit Netcraft at
- <http://www.netcraft.com/survey/servers.html> or Web Compare at
- <http://webcompare.internet.com>.
-
- 7. Apache
-
- The current version of Apache is 1.2.4. Version 1.3 is in beta
- testing. The main Apache site is at <http://www.apache.org/>.
- Another good source of information is Apacheweek at
- <http://www.apacheweek.com/>. The Apache documentation is OK, so I'm
- not going to go into detail in setting up Apache. The documentation
- is on the website and is included with the source (in HTML format).
- There are also text files included with the source, but the HTML
- version is better. The documentation should get a whole lot better
- once the Apache Documentation Project gets under way. Right now most
- of the documents are written by the developers. Not to discredit the
- developers, but they are a little hard to understand if you don't know
- the terminology.
-
- 7.1. Where to get
-
- Apache is included in the Red Hat, Slackware, and OpenLinux
- distributions. Although they may not be the latest version, they are
- very reliable binaries. The bad news is you will have to live with
- their directory choices (which are totally different from each other
- and the Apache defaults).
-
- The source is available from the Apache web site at
- <http://www.apache.org/dist/>. Binaries are also available from
- Apache at the same place. You can also get binaries from sunsite at
- <ftp://sunsite.unc.edu/pub/Linux/apps/www/servers/>. And for those of
- us running Red Hat the latest binary RPM file can usually be found in
- the contrib directory at <ftp://ftp.redhat.com/pub/contrib/i386/>.
- If your server is going to be used for commercial purposes, it is
- highly recommended that you get the source from the Apache website and
- compile it yourself. The other option is to use a binary that comes
- with a major distribution, for example Slackware, Red Hat, or
- OpenLinux. The main reason for this is security. An unknown binary
- could have a back door for hackers, or an unstable patch that could
- crash your system. Compiling it yourself also gives you more control
- over what modules are compiled in, and allows you to set the default
- directories. It's not that difficult to compile Apache, and besides
- you're not a real Linux user until you compile your own programs ;)
-
- 7.2. Compiling and Installing
-
- First untar the archive to a temporary directory. Next change to the
- src directory. Then edit the Configuration file if you want to
- include any special modules. The most commonly used modules are
- already included. There is no need to change the rules or makefile
- stuff for Linux. Next run the Configure shell script (./Configure).
- Make sure it says Linux platform and gcc as the compiler. Next you
- may want to edit the httpd.h file to change the default directories.
- The server home (where the config files are kept; Apache's own name
- for it is ServerRoot) default is /usr/local/etc/httpd/, but you may
- want to change it to just /etc/httpd/. The document root (where the
- HTML pages are served from) default is
- /usr/local/etc/httpd/htdocs/, but I like the directory
- /home/httpd/html (the Red Hat default for Apache). If you are going
- to be using su-exec (see special features below) you may want to
- change that directory too. The document root can also be changed
- from the config files. But it is also good to compile it in, just in
- case Apache can't find or read the config file. Everything else
- should be changed from the config files. Finally run make to compile
- Apache.
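
Before editing httpd.h it can help to see which directory defaults are
currently compiled in. The sketch below assumes the defaults are set
with #define lines for HTTPD_ROOT and DOCUMENT_LOCATION; those macro
names are an assumption about the header and may differ in your Apache
version. The function name show_default_dirs is made up.

```shell
#!/bin/sh
# Print the #define lines that set the compiled-in directories.
# HTTPD_ROOT and DOCUMENT_LOCATION are assumed macro names; check
# your own httpd.h if nothing is printed.
show_default_dirs() {
    grep -E '#define[[:space:]]+(HTTPD_ROOT|DOCUMENT_LOCATION)[[:space:]]' "$1"
}

# Example, run from the src directory before "make":
#   show_default_dirs httpd.h
```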
-
- If you run in to problems with include files missing, check the
- following things. Make sure you have the kernel headers (include
- files) installed for your kernel version. Also make sure you have
- these symbolic links in place:
-
- /usr/include/linux should be a link to /usr/src/linux/include/linux
- /usr/include/asm should be a link to /usr/src/linux/include/asm
- /usr/src/linux should be a link to the Linux source directory (ex.linux-2.0.30)
-
- Links can be made with ln -s. It works just like the cp command,
- except that it makes a link (ln -s source-dir destination-link).
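
A small sketch for checking the links listed above before you go
hunting for missing include files. The function name check_link is
made up for this example; readlink prints the target a symbolic link
points to.

```shell
#!/bin/sh
# Report whether a path is a symbolic link to the expected target.
check_link() {
    link=$1
    want=$2
    if [ -L "$link" ] && [ "$(readlink "$link")" = "$want" ]; then
        echo "ok: $link -> $want"
    else
        echo "missing or wrong: $link (expected -> $want)"
        return 1
    fi
}

# Example calls for the links listed above:
#   check_link /usr/include/linux /usr/src/linux/include/linux
#   check_link /usr/include/asm   /usr/src/linux/include/asm
# To create a missing link (as root):
#   ln -s /usr/src/linux/include/linux /usr/include/linux
```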
-
- When make is finished there should be an executable named httpd in the
- directory. This needs to be moved into a bin directory. /usr/sbin
- or /usr/local/sbin would be good choices.
-
- Copy the conf, logs, and icons sub-directories from the source to the
- server home directory. Next rename 3 of the files in the conf
- sub-directory to get rid of the -dist extension (ex. httpd.conf-dist
- becomes httpd.conf).
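
The renaming step can be done by hand with mv, or with a short loop
like the sketch below. The function name activate_conf is made up, and
the example path assumes the /etc/httpd server home suggested earlier.
Files that already exist are left alone.

```shell
#!/bin/sh
# Strip the -dist suffix from the sample files in a conf directory.
activate_conf() {
    confdir=$1
    for dist in "$confdir"/*-dist; do
        [ -e "$dist" ] || continue      # no matches: the pattern stays literal
        target=${dist%-dist}
        if [ -e "$target" ]; then
            echo "skipping $dist: $target already exists"
        else
            mv "$dist" "$target"
        fi
    done
}

# Example, assuming /etc/httpd as the server home:
#   activate_conf /etc/httpd/conf
```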
-
- There are also several support programs that are included with Apache.
- They are in the support directory and must be compiled and installed
- separately. Most of them can be made by using the makefile in that
- directory (which is made when you run the main Configure script). You
- don't need any of them to run Apache, but some of them make the
- administrator's job easier.
-
- 7.3. Configuring
-
- Now you should have four files in your conf sub-directory (under your
- server home directory). The httpd.conf sets up the server daemon
- (port number, user, etc). The srm.conf sets the root document tree,
- special handlers, etc. The access.conf sets the base case for access.
- Finally mime.types tells the server what mime type to send to the
- browser for each extension.
-
- The configuration files are pretty much self-documented (plenty of
- comments), as long as you understand the lingo. You should read
- through them thoroughly before putting your server to work. Each
- configuration item is covered in the Apache documentation.
-
- The mime.types file is not really a configuration file. It is used by
- the server to translate file extensions into mime-types to send to the
- browser. Most of the common mime-types are already in the file. Most
- people should not need to edit this file. As time goes on, more mime
- types will be added to support new programs. The best thing to do is
- get a new mime.types file (and maybe a new version of the server) at
- that time.
-
- Always remember when you change the configuration files you need to
- restart Apache or send it the SIGHUP signal with kill for the changes
- to take effect. Make sure you send the signal to the parent process
- and not any of the child processes. The parent usually has the lowest
- process id number. The process id of the parent is also in the
- httpd.pid file in the log directory. If you accidentally send it to one
- of the child processes the child will die and the parent will restart
- it.
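-
- One way to make sure the signal reaches the parent is to read its
- process id from the httpd.pid file. The helper below is a sketch; the
- function name is made up, and the path in the closing comment assumes
- the default server home with its logs sub-directory.
-
```shell
# reload_from_pidfile FILE: send SIGHUP to the process whose id is
# stored in FILE. (The function name is made up for this example.)
reload_from_pidfile() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile" >&2
        return 1
    fi
    kill -HUP "$(cat "$pidfile")"
}

# Typical call, assuming the default server home:
#   reload_from_pidfile /usr/local/etc/httpd/logs/httpd.pid
```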
-
- I will not be walking you through the steps of configuring Apache.
- Instead I will deal with specific issues, choices to be made, and
- special features.
-
- I highly recommend that all users read through the security tips in
- the Apache documentation. It is also available from the Apache
- website at <http://www.apache.org/docs/misc/security_tips.html>.
-
- 7.4. Hosting virtual websites
-
- Virtual Hosting is when one computer has more than one domain name.
- The old way was to have each virtual host have its own IP address.
- The new way uses only one IP address, but it doesn't work correctly
- with browsers that don't support HTTP 1.1.
-
- My recommendation for businesses is to go with the IP based virtual
- hosting until most people have browsers that support HTTP 1.1 (give it
- a year or two). This also gives you a more complete illusion of
- virtual hosting. While both methods can give you virtual mail
- capabilities (can someone confirm this?), only IP based virtual
- hosting can give you virtual FTP as well.
-
- If it is for a club or personal page, you may want to consider shared
- IP virtual hosting. It should be cheaper than IP based hosting and
- you will be saving precious IP addresses.
-
- You can also mix and match IP and shared IP virtual hosts on the same
- server. For more information on virtual hosting visit Apacheweek at
- <http://www.apacheweek.com/features/vhost>.
-
- 7.4.1. IP based virtual hosting
-
- In this method each virtual host has its own IP address. By
- determining the IP address that the request was sent to, Apache and
- other programs can tell what domain to serve. This is an incredible
- waste of IP space. Take for example the servers where my virtual
- domain is kept. They have over 35,000 virtual accounts, which means
- 35,000 IP addresses. Yet I believe at last count they had fewer than
- 50 servers running.
-
- Setting this up is a two part process. The first is getting Linux
- set up to accept more than one IP address. The second is setting up
- Apache to serve the virtual hosts.
-
- The first step in setting up Linux to accept multiple IP addresses is
- to make a new kernel. This works best with a 2.0 series kernel (or
- higher). You need to include IP networking and IP aliasing support.
- If you need help with compiling the kernel see the kernel howto
- <http://sunsite.unc.edu/LDP/HOWTO/Kernel-HOWTO.html>.
-
- Next you need to set up each interface at boot. If you are using the
- Red Hat distribution then this can be done from the control panel.
- Start X-windows as root and you should see a control panel. Then
- double click on network configuration. Next go to the interfaces
- panel and select your network card. Then click alias at the bottom of
- the screen. Fill in the information and click done. This will need to
- be done for each virtual host/IP address.
-
- If you are using another distribution you may have to do it manually.
- You can just put the commands in the rc.local file in /etc/rc.d
- (really they should go in with the networking stuff). You need to
- have an ifconfig and a route command for each device. The aliased
- addresses are given a sub device of the main one. For example eth0
- would have aliases eth0:0, eth0:1, eth0:2, etc. Here is an example of
- configuring an aliased device:
-
- ifconfig eth0:0 192.168.1.57
- route add -host 192.168.1.57 dev eth0:0
-
- You can also add a broadcast address and a netmask to the ifconfig
- command. If you have a lot of aliases you may want to make a for loop
- to make it easier. For more information see the IP alias mini howto
- <http://sunsite.unc.edu/LDP/HOWTO/mini/IP-Alias.html>.
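-
- A for loop for many aliases might look like the sketch below. To keep
- it harmless it only prints the commands instead of running them. The
- function name and the second address are made up for the example.
-
```shell
# alias_commands DEVICE IP...: print one ifconfig and one route command
# per address, numbering the sub devices DEVICE:0, DEVICE:1, and so on.
alias_commands() {
    dev=$1
    shift
    n=0
    for ip in "$@"; do
        echo "ifconfig $dev:$n $ip"
        echo "route add -host $ip dev $dev:$n"
        n=$((n + 1))
    done
}

alias_commands eth0 192.168.1.57 192.168.1.58
```
-
- Once the printed commands look right, the same lines can be pasted
- into rc.local or piped to sh as root.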
-
- Then you need to set up your domain name server (DNS) to serve these
- new domains. If you don't already own the domain names, you need
- to contact the Internic <http://www.internic.net> to register them.
- See the DNS-howto for information on setting up your DNS.
-
- Finally you need to set up Apache to serve the virtual domain
- correctly. This is done near the end of the httpd.conf configuration
- file, where an example is given to go by. All commands specific to a
- virtual host are put between its <VirtualHost> and </VirtualHost>
- tags. You can put almost any command in there. Usually you set up a
- different document root, script directory, and log files. You can
- have an almost unlimited number of virtual hosts by adding more
- <VirtualHost> sections.
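-
- If you host many sites, the sections can be generated rather than
- typed. The sketch below prints one section for httpd.conf. The
- function name, address, host name and directory layout are
- assumptions for illustration. The directives it prints (ServerName,
- DocumentRoot, ScriptAlias, ErrorLog, TransferLog) are ordinary Apache
- directives.
-
```shell
# vhost_block IP NAME ROOT: print a virtual host section that gives the
# site its own document root, script directory, and log files.
vhost_block() {
    ip=$1 name=$2 root=$3
    cat <<EOF
<VirtualHost $ip>
ServerName $name
DocumentRoot $root/html
ScriptAlias /cgi-bin/ $root/cgi-bin/
ErrorLog logs/$name-error_log
TransferLog logs/$name-access_log
</VirtualHost>
EOF
}

vhost_block 192.168.1.57 www.mysite.com /home/httpd/mysite
```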
-
- In rare cases you may need to run separate servers if a directive is
- needed for a virtual host, but is not allowed inside the <VirtualHost>
- tags. This is done using the BindAddress directive. Each server
- will have a different name and set of configuration files. Each
- server only responds to one IP address, specified by the BindAddress
- directive. This is an incredible waste of system resources.
-
- 7.4.2. Shared IP virtual hosting
-
- This is a new way to do virtual hosting. It uses a single IP address,
- thus conserving IP addresses for real machines (not virtual ones). In
- the same example used above those 35,000 virtual hosts would only take
- 50 IP addresses (one for each machine). This is done by using the new
- HTTP 1.1 protocol. The browser tells the server which site it wants
- (in the Host header) when it sends the request. The problem is that
- browsers that don't support HTTP 1.1 will get the server's main page,
- which could be set up to provide a menu of the virtual hosts
- available. That ruins the whole illusion of virtual hosting: the
- illusion that you have your own server.
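-
- To see what the server has to go on, here is a sketch that prints the
- kind of request an HTTP 1.1 browser sends. The function name is made
- up. The Host line is the part that older browsers leave out.
-
```shell
# http11_request HOST PATH: print a minimal HTTP 1.1 request.
# Header lines end in CR LF, and an empty line ends the request.
http11_request() {
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$2" "$1"
}

http11_request www.mysite.com /
```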
-
- The setup is much simpler than for IP based virtual hosting. You
- still need to get your domain from the Internic and set up your DNS.
- This time the DNS points to the same IP address as the original
- domain. Then Apache is set up the same as before. Since you are using
- the same IP address in the <VirtualHost> tags, it knows you want
- shared IP virtual hosting.
-
- There are several workarounds for older browsers. I'll explain the
- best one. First you need to make your main pages a virtual host
- (either IP based or shared IP). This frees up the main page for a
- link list to all your virtual hosts. Next you need to make a back
- door for the old browsers to get in. This is done by putting a
- ServerPath directive inside the <VirtualHost> section of each virtual
- host. For example, by adding ServerPath /mysite/ to
- www.mysite.com, old browsers would be able to access the site at
- www.mysite.com/mysite/. Then you put a default page on the main
- server that politely tells them to get a new browser, and lists links
- to the back doors of all the sites you host on that machine. When
- an old browser accesses the site it will be sent to the main page,
- and get a link to the correct page. New browsers will never see the
- main page and will go directly to the virtual hosts. You must
- remember to keep all of the links within your web sites relative,
- because the pages will be accessed from two different URLs
- (www.mysite.com and www.mysite.com/mysite/).
-
- I hope I didn't lose you there, but it's not an easy workaround. Maybe
- you should consider IP based hosting after all. A very similar
- workaround is also explained on the Apache website at
- <http://www.apache.org/manual/host.html>.
-
- If anyone has a great resource for Shared IP hosting, I would like to
- know about it. It would be nice to know what percent of browsers out
- there support HTTP 1.1, and to have a list of which browsers and
- versions support HTTP 1.1.
-
- 7.5. CGI scripts
-
- There are two different ways to give your users CGI script capability.
- The first is to make everything ending in .cgi a CGI script. The
- second is to make script directories (usually named cgi-bin). You
- could also use both methods. For either method to work the scripts
- must be world executable (chmod 711 for compiled programs; scripts
- run by an interpreter such as sh or Perl must be world readable as
- well, so use chmod 755). By giving your users script access you are
- creating a big security risk. Be sure to do your homework to minimize
- the security risk.
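-
- For readers who have never seen one, here is a sketch of about the
- smallest possible CGI script and the permissions it needs. The file
- name hello.cgi and the scratch directory are made up for the example.
-
```shell
# Write a tiny CGI script into a scratch directory and mark it
# readable and executable by everyone, as the web server needs.
cgidir=$(mktemp -d)
cat > "$cgidir/hello.cgi" <<'EOF'
#!/bin/sh
# A CGI script prints a header, an empty line, then the document body.
echo "Content-type: text/plain"
echo
echo "Hello from CGI"
EOF
chmod 755 "$cgidir/hello.cgi"

# Run it by hand to see exactly what the server would pass on.
sh "$cgidir/hello.cgi"
```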
-
- I prefer the first method, especially for complex scripting. It
- allows you to put scripts in any directory. I like to put my scripts
- with the web pages they work with. For sites with a lot of scripts it
- looks much better than having a directory full of scripts. This is
- simple to set up. First uncomment the .cgi handler at the end of the
- srm.conf file. Then make sure all your directories have the option
- ExecCGI or All in the access.conf file.
-
- Making script directories is considered more secure. To make a script
- directory you use the ScriptAlias directive in the srm.conf file. The
- first argument is the alias; the second is the actual directory. For
- example ScriptAlias /cgi-bin/ /usr/httpd/cgi-bin/ would make
- /usr/httpd/cgi-bin able to execute scripts. That directory would be
- used whenever someone asked for the directory /cgi-bin/. For security
- reasons you should also change the properties of the directory to
- Options None, AllowOverride None in the access.conf (just uncomment
- the example that is there). Also do not make your script directories
- subdirectories of your web page directories. For example if you are
- serving pages from /home/httpd/html/, don't make the script directory
- /home/httpd/html/cgi-bin; instead make it /home/httpd/cgi-bin.
-
- If you want your users to have their own script directories you can
- use multiple ScriptAlias commands. Virtual hosts should have their
- ScriptAlias command inside the <VirtualHost> tags. Does
- anyone know a simple way to allow all users to have a cgi-bin
- directory without individual ScriptAlias commands?
-
- 7.6. Users Web Directories
-
- There are two different ways to handle user web directories. The
- first is to have a subdirectory under the user's home directory
- (usually public_html). The second is to have an entirely different
- directory tree for web directories. With both methods make sure to
- set the access options for these directories in the access.conf file.
-
- The first method is already set up in Apache by default. Whenever a
- request for /~bob/ comes in it looks for the public_html directory in
- bob's home directory. You can change the directory with the UserDir
- directive in the srm.conf file. This directory must be world readable
- and executable. This method creates a security risk because for
- Apache to access the directory the user's home directory must be world
- executable.
-
- The second method is easy to set up. You just need to change the
- UserDir directive in the srm.conf file. It has many different
- formats; you may want to consult the Apache documentation for
- clarification. If you want each user to have their own directory
- under /home/httpd/, you would use UserDir /home/httpd. Then when the
- request /~bob/ comes in it would translate to /home/httpd/bob/. Or if
- you want to have a subdirectory under bob's directory you would use
- UserDir /home/httpd/*/html. This would translate to
- /home/httpd/bob/html/ and would allow you to have a script directory
- too (for example /home/httpd/bob/cgi-bin/).
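-
- The translations described above can be mimicked with a small shell
- function, which may help when checking a UserDir setting on paper.
- This is only a sketch of the three cases mentioned in this section,
- not Apache's own code. The function name is made up, and the
- relative case assumes that home directories live under /home.
-
```shell
# userdir_path PATTERN USER: print the directory that a request for
# /~USER/ would map to under the given UserDir setting.
userdir_path() {
    case $1 in
        *[*]*) echo "${1%%[*]*}$2${1#*[*]}/" ;;  # put the user where the * is
        /*)    echo "$1/$2/" ;;                  # absolute path: append the user
        *)     echo "/home/$2/$1/" ;;            # relative: assumed /home/USER
    esac
}

userdir_path public_html bob
userdir_path /home/httpd bob
userdir_path '/home/httpd/*/html' bob
```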
-
- 7.7. Daemon mode vs. Inetd mode
-
- There are two ways that Apache can be run. One is as a daemon that is
- always running (Apache calls this standalone). The second is from the
- inetd super-server.
-
- Daemon mode is far superior to inetd mode. Apache is set up for daemon
- mode by default. The only reason to use inetd mode is for very
- low use applications, such as internal testing of scripts, a small
- company Intranet, etc. Inetd mode will save memory because Apache
- will be loaded as needed. Only the inetd daemon will remain in
- memory.
-
- If you don't use Apache that often you may just want to keep it in
- daemon mode and just start it when you need it. Then you can kill it
- when you are done (be sure to kill the parent and not one of the child
- processes).
-
- To set up inetd mode you need to edit a few files. First in
- /etc/services see if http is already in there. If it's not then add
- it:
-
- http 80/tcp
-
- Right after 79 (finger) would be a good place. Then you need to edit
- the /etc/inetd.conf file and add the line for Apache:
-
- http stream tcp nowait root /usr/sbin/httpd httpd
-
- Be sure to change the path if you have Apache in a different location.
- And the second httpd is not a typo; the inet daemon requires that. If
- you are not currently using the inet daemon, you may want to comment
- out the rest of the lines in the file so you don't activate other
- services as well (FTP, finger, telnet, and many other things are usually
- run from this daemon).
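-
- Both edits can be scripted so that a line is only added when it is
- missing. The sketch below works on scratch copies rather than the
- real files in /etc. The function name add_once is made up for the
- example.
-
```shell
# Scratch copies stand in for /etc/services and /etc/inetd.conf.
etc=$(mktemp -d)
printf 'finger\t79/tcp\n' > "$etc/services"
: > "$etc/inetd.conf"

# add_once FILE NAME LINE: append LINE to FILE unless some line in FILE
# already starts with the service name NAME.
add_once() {
    file=$1 name=$2 line=$3
    grep -q "^${name}[[:space:]]" "$file" || echo "$line" >> "$file"
}

add_once "$etc/services" http 'http 80/tcp'
add_once "$etc/inetd.conf" http 'http stream tcp nowait root /usr/sbin/httpd httpd'
add_once "$etc/services" http 'http 80/tcp'    # second call changes nothing

cat "$etc/services" "$etc/inetd.conf"
```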
-
- If you are already running the inet daemon (inetd), then you only need
- to send it the SIGHUP signal (via kill; see kill's man page for more
- info) or reboot the computer for changes to take effect. If you are
- not running inetd then you can start it manually. You should also add
- it to your init files so it is loaded at boot (the rc.local file may
- be a good choice).
-
- 7.8. Allowing put and delete commands
-
- The newer web publishing tools support this new method of uploading
- web pages by HTTP (instead of FTP). Some of these products don't even
- support FTP anymore! Apache does support this, but it is lacking a
- script to handle the requests. This script could be a big security
- hole, so be sure you know what you are doing before attempting to
- write or install one.
-
- If anyone knows of a script that works let me know and I'll include
- the address to it here.
-
- For more information go to Apacheweek's article at
- <http://www.apacheweek.com/features/put>.
-
- 7.9. User Authentication/Access Control
-
- This is one of my favorite features. It allows you to password
- protect a directory or a file without using CGI scripts. It also
- allows you to deny or grant access based on the IP address or domain
- name of the client. That is a great feature for keeping jerks out of
- your message boards and guest books (you get the IP or domain name
- from the log files).
-
- To allow user authentication the directory must have AllowOverride
- AuthConfig set in the access.conf file. To allow access control (by
- domain or IP address) AllowOverride Limit must be set for that
- directory.
-
- Setting up the directory involves putting an .htaccess file in the
- directory. For user authentication it is usually used with an
- .htpasswd and optionally a .htgroup file. Those files can be shared
- among multiple .htaccess files if you wish.
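-
- As a starting point, the sketch below writes a small .htaccess file
- that combines both kinds of protection. The scratch directory, the
- realm name, the password file location and the domain are all
- assumptions for illustration. Check the directives against the
- Apache documentation for your version before relying on them.
-
```shell
# Scratch directory standing in for the directory to protect.
dir=$(mktemp -d)

# Only accept clients from one example domain, and ask those clients
# for a user name and password found in the password file.
cat > "$dir/.htaccess" <<'EOF'
AuthType Basic
AuthName Members
AuthUserFile /home/httpd/.htpasswd
<Limit GET POST>
order deny,allow
deny from all
allow from .mysite.com
require valid-user
</Limit>
EOF

cat "$dir/.htaccess"
```
-
- The password file itself is normally made with the htpasswd program
- from the support directory (htpasswd -c passwordfile username).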
-
- For security reasons I recommend that everyone use these directives in
- their access.conf file:
-
- <files ~ "/\.ht">
- order deny,allow
- deny from all
- </files>
-
- If you are not the administrator of the system you can also put it in
- your .htaccess file. This directive will prevent people from looking
- into your access control files (.htaccess, .htpasswd, etc).
-
- There are many different options and file types that can be used with
- access control. Therefore it is beyond the scope of this document to
- describe the files. For information on how to set up User
- Authentication see the Apacheweek feature at
- <http://www.apacheweek.com/features/userauth> or the NCSA pages at
- <http://hoohoo.ncsa.uiuc.edu/docs-1.5/tutorials/user.html>.
-
- 7.10. su-exec
-
- The su-exec feature runs CGI scripts as the user who owns them.
- Normally they are run as the user of the web server (usually nobody).
- This allows users to access their own files in CGI scripts without
- making them world writable (a security hole). But if you are not
- careful you can create a bigger security hole by using the su-exec
- code. The su-exec code does security checks before executing the
- scripts, but if you set it up wrong you will have a security hole.
-
- The su-exec code is not for amateurs. Don't use it if you don't know
- what you are doing. You could end up with a gaping security hole
- where your users can gain root access to your system. Do not modify
- the code for any reason. Be sure to read all the documentation
- carefully. The su-exec code is hard to set up on purpose, to keep the
- amateurs out (everything must be done manually; there is no makefile
- and no install script).
-
- The su-exec code resides in the support directory of the source.
- First you need to edit the suexec.h file for your system. Then you
- need to compile the su-exec code with this command:
-
- gcc suexec.c -o suexec
-
- Then copy the suexec executable to the proper directory. The Apache
- default is /usr/local/etc/httpd/sbin/. This can be changed by editing
- httpd.h in the Apache source and recompiling Apache. Apache will only
- look in this directory; it will not search the path. Next the file
- needs to be owned by root (chown root suexec) and the suid bit
- needs to be set (chmod 4711 suexec). Finally restart Apache. It
- should display a message on the console that su-exec is being used.
-
- CGI scripts should be set world executable like normal. They will
- automatically be run as the owner of the CGI script. If you set the
- SUID (set user id) bit on the CGI scripts they will not run. If the
- directory or file is world or group writable the script will not run.
- Scripts owned by system users will not be run (root, bin, etc.). For
- other security conditions that must be met see the su-exec
- documentation. If you are having problems see the su-exec log file
- named cgi.log.
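-
- Before digging through cgi.log it can help to check a script's
- permission bits by hand. The sketch below tests only the file
- conditions mentioned in this section (world executable, not SUID, not
- group or world writable). It is not the su-exec code itself and does
- not cover the directory, ownership or other checks. Both function
- names and the scratch file are made up for the example.
-
```shell
# has_mode FILE MODE: true if FILE has every permission bit in MODE set.
has_mode() { [ -n "$(find "$1" -prune -perm "$2")" ]; }

# suexec_precheck FILE: report the first problem found and return 1,
# or report that the file passed and return 0.
suexec_precheck() {
    f=$1
    has_mode "$f" -001  || { echo "$f: not world executable"; return 1; }
    has_mode "$f" -4000 && { echo "$f: SUID bit is set"; return 1; }
    has_mode "$f" -020  && { echo "$f: group writable"; return 1; }
    has_mode "$f" -002  && { echo "$f: world writable"; return 1; }
    echo "$f: passed these checks"
}

# Try it on a scratch file with good and then bad permissions.
script=$(mktemp)
chmod 755 "$script"; suexec_precheck "$script"
chmod 777 "$script"; suexec_precheck "$script" || true
```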
-
- Su-exec does not work if you are running Apache from inetd; it only
- works in daemon mode. It will be fixed in the next version because
- there will be no inetd mode. If you like playing around in source
- code, you can edit http_main.c. You want to get rid of the line
- where Apache announces that it is using the su-exec wrapper (it
- wrongly prints this in front of the output of everything).
-
- Be sure to read the Apache documentation on su-exec. It is included
- with the source and is available on the Apache web site at
- <http://www.apache.org/docs/suexec.html>
-
- 7.11. Imagemaps
-
- Apache has the ability to handle server side imagemaps. Imagemaps are
- images on webpages that take users to different locations depending on
- where they click. To enable imagemaps first make sure the imagemap
- module is installed (it's one of the default modules). Next you need
- to uncomment the .map handler at the end of the srm.conf file. Now
- all files ending in .map will be treated as imagemap files, which map
- areas of an image to separate links. Apache uses map files in
- the standard NCSA format. Here is an example of using a map file in a
- web page:
-
- <a href="/map/mapfile.map">
- <img src="picture.gif" ISMAP>
- </a>
-
- In this example mapfile.map is the mapfile, and picture.gif is the
- image to click on.
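-
- For orientation, the sketch below writes a small map file in the NCSA
- style: each line gives a shape, the URL to go to, and the
- coordinates. The target pages and coordinates are made up, and the
- file is written to a scratch directory rather than a real /map/
- directory.
-
```shell
# Scratch directory standing in for the directory holding map files.
mapdir=$(mktemp -d)

# default: where clicks outside every shape go
# rect:    upper left and lower right corners
# circle:  center point, then any point on the edge
cat > "$mapdir/mapfile.map" <<'EOF'
default /index.html
rect /products.html 0,0 99,49
circle /contact.html 150,25 150,50
EOF

cat "$mapdir/mapfile.map"
```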
-
- There are many programs that can generate NCSA compatible map files or
- you can create them yourself. For a more detailed discussion of
- imagemaps and map files see the Apacheweek feature at
- <http://www.apacheweek.com/features/imagemaps>.
-
- 7.12. SSI/XSSI
-
- Server Side Includes (SSI) add dynamic content to otherwise static
- web pages. The includes are embedded in the web page as comments.
- The web server then parses these includes and passes the results to
- the browser. SSI can add headers and footers to documents, add the
- date the document was last updated, or execute a system command or a
- CGI script. With the new eXtended Server Side Includes (XSSI) you can
- do a whole lot more. XSSI adds variables and flow control statements
- (if, else, etc). It's almost like having a programming language to
- work with.
-
- Parsing all HTML files for SSI commands would waste a lot of system
- resources. Therefore you need to distinguish normal HTML files from
- those that contain SSI commands. This is usually done by changing the
- extension of the SSI enhanced HTML files. Usually the .shtml
- extension is used.
-
- To enable SSI/XSSI first make sure that the includes module is
- installed. Then edit srm.conf and uncomment the AddType and
- AddHandler directives for .shtml files. Finally you must set Options
- Includes for all directories where you want to run SSI/XSSI files.
- This is done in the access.conf file. Now all files with the
- extension .shtml will be parsed for SSI/XSSI commands.
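-
- To give an idea of what such a file looks like, the sketch below
- writes a small .shtml page with two common includes: one that prints
- the date the file was last modified and one that pulls in a shared
- footer. The file names and the footer path are made up for the
- example.
-
```shell
# Scratch directory standing in for a directory under the document root.
pages=$(mktemp -d)

# Each include is an HTML comment starting with "#", so a server
# without SSI enabled simply passes it through as a comment.
cat > "$pages/index.shtml" <<'EOF'
<html><body>
<p>Welcome to my page.</p>
<p>Last updated: <!--#echo var="LAST_MODIFIED" --></p>
<!--#include virtual="/footer.html" -->
</body></html>
EOF

cat "$pages/index.shtml"
```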
-
- Another way of enabling includes is to use the XBitHack directive. If
- you turn this on it looks to see if the file is executable by user.
- If it is and Options Includes is on for that directory, then it is
- treated as an SSI file. This only works for files with the mime type
- text/html (.html .htm files). This is not the preferred method.
-
- There is a security risk in allowing SSI to execute system commands
- and CGI scripts. Therefore it is possible to lock that feature out
- with Options IncludesNOEXEC instead of Options Includes in the
- access.conf file. All the other SSI commands will still work.
-
- For more information see the Apache mod_include documentation that
- comes with the source. It is also available on the website at
- <http://www.apache.org/docs/mod/mod_include.html>.
-
- For a more detailed discussion of SSI/XSSI implementation see the
- Apacheweek feature at <http://www.apacheweek.com/features/ssi>.
-
- For more information on SSI commands see the NCSA documentation at
- <http://hoohoo.ncsa.uiuc.edu/docs/tutorials/includes.html>.
-
- For more information on XSSI commands go to
- <ftp://pageplus.com/pub/hsf/xssi/xssi-1.1.html>.
-
- 7.13. Module system
-
- Apache can be extended to support almost anything with modules. There
- are a lot of modules already in existence. Only the general interest
- modules are included with Apache. For links to existing modules go to
- the Apache Module Registry at
- <http://www.zyzzyva.com/module_registry/>.
-
- For module programming information go to
- <http://www.zyzzyva.com/module_registry/reference/>.
-
- 8. Web Server Add-ons
-
- Sorry, this section has not been written yet.
-
- Coming soon: mSQL, PHP/FI, cgiwrap, Fast-cgi, MS frontpage extensions,
- and more.
-
- 9. FAQ
-
- There aren't any frequently asked questions - yet...
-
- 10. For further reading
-
- 10.1. O'Reilly & Associates Books
-
- In my humble opinion O'Reilly & Associates make the best technical
- books on the planet. They focus mainly on Internet, Unix and
- programming related topics. They start off slow with plenty of
- examples and when you finish the book you're an expert. I think you
- could get by if you only read half of the book. They also add some
- humor to otherwise boring subjects.
-
- They have great books on HTML, PERL, CGI Programming, Java,
- JavaScript, C/C++, Sendmail, Linux and much much more. And the fast
- moving topics (like HTML) are updated and revised about every 6 months
- or so. So visit the O'Reilly & Associates <http://www.ora.com/> web
- site or stop by your local book store for more info.
-
- And remember if it doesn't say O'Reilly & Associates on the cover,
- someone else probably wrote it.
-
- 10.2. Internet Request For Comments (RFC)
-
- · RFC1866 by T. Berners-Lee and D. Connolly, "Hypertext
- Markup Language - 2.0", 11/03/1995
-
- · RFC1867 by E. Nebel and L. Masinter, "Form-based File
- Upload in HTML", 11/07/1995
-
- · RFC1942 by D. Raggett, "HTML Tables", 05/15/1996
-
- · RFC1945 by T. Berners-Lee, R. Fielding, H. Nielsen, "Hypertext
- Transfer Protocol -- HTTP/1.0", 05/17/1996
-
- · RFC1630 by T. Berners-Lee, "Universal Resource Identifiers in WWW:
- A Unifying Syntax for the Expression of Names and Addresses of
- Objects on the Network as used in the World-Wide Web", 06/09/1994
-
- · RFC1959 by T. Howes, M. Smith, "An LDAP URL Format", 06/19/1996
-
-